March 1st, ’23

Overview

1 | background

2 | questions

3 | approaches

4 | methods

5 | results

6 | future

7 | conclusions

Cercocarpus

Team

Jane Ogilvie (ecological fieldwork, design)

Emily J. Woodworth (pollen morphology and microscopy)

Sophie Taddeo (geo-spatial, statistics)

Paul CaraDonna (little bees/big picture(s))

Jeremie Fant (all things molecular and tied together)

i.e. an in-house production.

the world is big - 1.1

Primary Roads

really really big - 1.2

  • 5 sampling seasons (May - October)
  • 3 person crews
  • 2 partial support personnel
  • 281 plots
  • area of inference: ~ 900,000 acres
  • 0.363% of Bureau of Land Management administered land

FQI calculated from AIM

funding opportunities - 1.3

how do we sample the planet? - 1.4

plant species in ecology - 1.6

  • mis-identification is very common
  • mis-identification can lead to nebulous understandings
  • mis-identification can lead to mis-management

Cirsium scariosum

insect species in ecology - 1.7

  • Macro Invertebrates
    • stream ecology bio indicators
      • mayflies, caddisflies, stoneflies
  • Coleoptera
    • soil contamination by metal
  • from bio-indicators to foci?

Macroinvertebrates.org

from organisms to interactions - 1.8


Solidago spathulata & Megachile wheeleri, by A. Litz

metabarcoding - 1.10

  • Barcoding
    • molecular identification of tissue from a single organism
  • Metabarcoding
    • molecular identification of organisms present in a mixed substrate

Five Astragali

barcodes - 1.11

  • Kingdom: Animal
    • COI (Cytochrome c Oxidase subunit I)
    • holding its own in Fungi
  • Kingdoms: Fungi + Plant
    • ITS (Internal Transcribed Spacer)
    • holding its own in Fungi
  • Kingdom: Plant
    • ITS, rbcL, matK, trnH-psbA
    • not holding much of anything

new barcodes for plants? - 1.12

  • genomics
    • low cost
    • high coverage
    • PCR free?
  • reference library?
    • old barcode library in development for nearly 20 years
    • Kew PAFTOL
  • angiosperms 353

2 - questions

scale - from plots to continents?

  • many questions will be approached using two perspectives
    • bottom up i.e. plot based data collected by Jane
    • top down i.e. computer based data generated by me
  • fine-scale data serve as ground truth for the computer-generated models

can we predict what is flowering in time & space? - 2.1

  • which species are present in an area?
  • when are these species flowering in an area?
  • diverse clades provide challenges for identification
  • species have often diverged in ecological traits

Rhododendron sp. Hengduan Mtns., by Qin Li

do a353 work as barcodes? - 2.2

  • ‘universal’ markers for phylogenomics
  • usable in all flowering plant clades
  • first comprehensive genus level phylogeny of flowering plants
  • shoot the moon; metagenomics first

are a353 semi-quantitative? - 2.3

Does the number of sequence reads reflect the amount of biological material in a sample?

3 - approaches

predict what is flowering; Time & Space - 3.1

  • no longer any funding for floristics; few Floras maintained, fewer written
  • essentially no funding remains for alpha taxonomy
  • little to no funding for natural history
  • how do we monitor ecological shifts under climate change?
    • geographic ranges
    • flowering time
  • back to the sheets!

FPNW 2nd

custom sequence databases; a353 as barcodes? - 3.2

  • reduce number of species present in database
    • reduce computational requirements
    • increase likelihood of relevant matches across loci
    • reduce false positives for semi-quantitative inference
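
The database-reduction idea above can be sketched as a simple filter over reference sequences. This is only an illustration: it assumes FASTA headers begin with the binomial ("Genus species locus"), and `parse_fasta`, `filter_fasta`, the header layout, and the toy sequences are all hypothetical rather than the project's actual pipeline.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_fasta(text, candidate_species):
    """Keep only records whose header starts with a candidate binomial.

    Assumes (hypothetically) that headers look like 'Genus species locus_id'.
    """
    wanted = {name.lower() for name in candidate_species}
    kept = []
    for header, seq in parse_fasta(text):
        binomial = " ".join(header.split()[:2]).lower()
        if binomial in wanted:
            kept.append((header, seq))
    return kept


# Toy reference set; the sequences and locus ids are made up for illustration.
reference = """>Cirsium scariosum locus_0001
ACGTACGTAA
>Rhododendron ferrugineum locus_0001
ACGTTCGTAA
>Solidago spathulata locus_0001
ACGAACGTTA
"""
candidates = ["Cirsium scariosum", "Solidago spathulata"]
custom_db = filter_fasta(reference, candidates)
print([header for header, _ in custom_db])
```

Only sequences from species on the candidate list survive, so a downstream classifier has fewer irrelevant taxa to match against.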

queen bee pollen loads: a353 as barcodes? - 3.3

  • DNA extracted from corbiculae loads
    • a ‘pollen basket’ for holding collected grains
  • variable in size, but generally many tens of thousands of grains

    USFWS

identify pollen grains; a353 semi-quantitative? - 3.4

4 - methods

  • field work
  • spatial
  • temporal
  • morphologic
  • laboratory
  • bioinformatic
  • post-classification

study system & field work - 4.1

pollen morphological identification - 4.2

workflow

pollen reference library - 4.2.1

  • ca. 110 species
  • 60 novel species added
  • shared
  • many more species to add to key!! (60+, mostly unsampled families)

pollen corbiculae loads - 4.2.2

  • aliquot from same sample used for molecular
  • stained by fuchsin jelly with stirring
  • transects
  • rarefaction curves
    • richness
    • abundance
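
As an illustration of a rarefaction curve for counted grains, here is a minimal sketch using Hurlbert's individual-based rarefaction, which gives the expected number of taxa in a random subsample of n grains. Whether the project used this estimator is an assumption, and the grain counts are hypothetical.

```python
from math import comb


def rarefied_richness(counts, n):
    """Expected number of taxa in a random subsample of n grains.

    Hurlbert's formula: sum over taxa of 1 - C(N - N_i, n) / C(N, n),
    where N is the total count and N_i the count of taxon i.
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total count")
    denom = comb(total, n)
    return sum(1 - comb(total - c, n) / denom for c in counts if c > 0)


# Hypothetical grain counts per pollen morphotype in one corbicular load.
grain_counts = [120, 45, 20, 10, 5]
curve = [(n, round(rarefied_richness(grain_counts, n), 2)) for n in (10, 50, 100, 200)]
print(curve)
```

The curve flattens as the subsample size approaches the total count, which is the usual visual check for whether enough grains were counted.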

Corbiculae Sample

molecular barcoding - 4.3

  • Angiosperms 353…

spatial analysis - 4.3.1

  • 2-stage approach
    • 1st: distance search of records from museums & plot based data (e.g. Forest Service)
    • 2nd: species distribution modelling
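
The first stage, a distance search of occurrence records, might look like the sketch below. It assumes records are (species, latitude, longitude) tuples and uses the haversine great-circle distance; the site coordinates, radius, and records are hypothetical placeholders, not project data.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def species_within(records, site, radius_km):
    """Return the sorted set of species with a record inside the radius."""
    lat0, lon0 = site
    return sorted({
        species
        for species, lat, lon in records
        if haversine_km(lat0, lon0, lat, lon) <= radius_km
    })


# Hypothetical field site and occurrence records.
site = (39.0, -107.0)
records = [
    ("Delphinium sp.", 39.1, -107.1),
    ("Mertensia sp.", 39.5, -107.0),
    ("Oxytropis sp.", 41.0, -105.0),
]
print(species_within(records, site, radius_km=100))
```

The species returned here would then feed the second stage, where distribution models refine the candidate list.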

plant species, distribution modelling - 4.3.1.1

  • develop a candidate species list for barcoding
  • download all herbarium records from a distance exceeding the study area
  • compare to known species at field site
  • logistic regression
  • bootstrapped samples of records

species distribution modelling - 4.3.1.2

sdm evaluations - 4.3.1.2

  • in pipeline: True Skill Statistic (TSS)
    • works well over wide range of occurrence records
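
For reference, the True Skill Statistic is sensitivity plus specificity minus one, so it ranges from -1 to 1 with 0 meaning no better than chance. A minimal sketch follows; the confusion-matrix counts are hypothetical.

```python
def true_skill_statistic(tp, fn, tn, fp):
    """TSS = sensitivity + specificity - 1, from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return sensitivity + specificity - 1


# Hypothetical counts: 80 presences found, 20 missed,
# 90 absences correctly rejected, 10 falsely predicted.
tss = true_skill_statistic(tp=80, fn=20, tn=90, fp=10)
print(round(tss, 2))
```

Because each rate is computed within its own class, the score does not depend on how many presences versus absences are in the evaluation set.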

temporal modelling - 4.3.2

  • reduce herbarium records to study domain
  • thin records to analogous ecoregions
  • trim start/end records
  • identify major phenological cues, subset records to similar areas
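
One way to read "trim start/end records" is to drop the earliest and latest few percent of collection dates, by day of year, before estimating a flowering window. The sketch below assumes that reading; the 5% cutoff, the function names, and the collection dates are hypothetical.

```python
from datetime import date


def day_of_year(d):
    """Convert a date to its ordinal day within the year (1-366)."""
    return d.timetuple().tm_yday


def trim_tails(days, fraction=0.05):
    """Drop the earliest and latest `fraction` of records (by count)."""
    ordered = sorted(days)
    k = int(len(ordered) * fraction)
    return ordered[k:len(ordered) - k]


# Hypothetical herbarium collection dates for one species:
# one very early record, a cluster in June, one very late record.
collection_dates = (
    [date(1987, 4, 5)]
    + [date(2001, 6, d) for d in range(9, 27)]
    + [date(1995, 10, 17)]
)
doys = [day_of_year(d) for d in collection_dates]
trimmed = trim_tails(doys, fraction=0.05)
print(trimmed[0], trimmed[-1])  # trimmed first and last flowering day of year
```

Trimming this way keeps a single out-of-season specimen from stretching the estimated flowering period.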

temporal modelling subset - 4.3.2.1

SPATIAL SUBSET PICTURE

temporal modelling distributions - 4.3.2.2

barcode references library - 4.4

genomics work - 4.4.1

  • Plant Reference Library
    • herbarium & silica dried
    • CTAB, some DNeasy
  • Pollen Extraction
    • ‘novel’ CTAB / SDS extraction
  • Both
    • clean up Cytiva, size selection SPRI
    • enzymatic fragmentation

plant genomic reference dna - 4.4.2

  • 38 species to sequencing
  • 13 species duplicate
  • 24 silica gel dried, 14 herbarium leaf tissue (RM, ID, IDS)

tissue from Rocky Mountain Herbarium

pollen genomics dna - 4.4.3

  • 54 Initial samples for extraction
  • 44 samples underwent all steps and were analysed

hyb-seq

barcoding informatics - 4.4.4

  • Trimmomatic - remove tags, select sequences > 31 bp in length
  • Kraken - qualitative identification
  • Bracken - quantitative identification
  • BLAST followup

metabarcoding - 4.5

sequence database generation - 4.5.1

  • Kew Tree of Life ~ ### taxa
  • US ~ xx TAXA

database

sequence assignment - 4.5.2

semi-quantitative evidence - 4.5.3

5 - results

field work

  • 723 floral visitation observations (!)
  • 36 unique plant species involved
  • 64 corbiculae loads from Queens

sdm candidate species ????

  • downloaded some 112k records
  • mostly trees from forestry surveys
  • bootstrap re-sampled to reduce effects of collection ‘hotspots’
  • non-present taxa begin nearly immediately…
  • real occurrences taper off quickly

database

sdm evaluations - computational - 5.2

Logistic regression assessing accuracy of SDMs; withheld data

Metric                Value    Metric        Value
Accuracy (Training)   83.75    F-Score       0.84
Accuracy (Test)       84.00    AUC           0.92
Recall                81.03    Concordance   0.92
True Neg. Rate        86.97    Discordance   0.08
Precision             88.04    Tied          0.00
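
As a quick consistency check on the table, the F-score is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported precision (88.04) and recall (81.03); the function name is illustrative.

```python
def f_score(precision, recall):
    """F1: harmonic mean of precision and recall (both as proportions)."""
    return 2 * precision * recall / (precision + recall)


# Values taken from the table, converted from percent to proportions.
precision = 88.04 / 100
recall = 81.03 / 100
f1 = f_score(precision, recall)
print(round(f1, 2))  # prints 0.84, matching the reported F-Score
```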

sdm evaluations - 5.3.1

  • able to compare to a localized vascular plant checklist
    • not everything w/ vouchers…
  • able to remove nearly all species from the upper (alpine) & lower (sagesteppe) life zones

              ml     lm
ensembles    493    473
true +       362    286
true -        33     55
false +       64     41
false -       34     93

sdm evaluations - 5.3.2

  • We were interested in comparison to the Valleys.
  • Plot Level, 117 species total (109 eligible for modelling…)
    • ML: 105 (89.7% of all 117; 96.3% of the 109 eligible)
    • LM: 102 (87.2% of all 117; 93.5% of the 109 eligible)
  • Able to detect virtually all species recorded on plot

coarse phenological modelling - 5.4.1

  • strong agreement between first and peak flower periods with historic data
  • good agreement for last flower date
  • no agreement with duration! - species do not ‘line up’

flower dates

coarse phenological modelling - 5.4.2

  • similar results with weekly data across all field sites combined
  • tau values lower than with the longer-term data
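
Assuming the tau values here are Kendall's rank correlation between modelled and observed flowering dates (the source does not say which tau), the statistic can be sketched as below. The species-level dates are hypothetical, and this simple tau-a variant does not correct for ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical peak-flower day of year for five species.
modelled = [150, 165, 172, 180, 195]
observed = [155, 160, 178, 176, 200]
print(kendall_tau_a(modelled, observed))
```

A value near 1 means the species flower in nearly the same order in both data sets, even if the absolute dates differ.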

flower dates

metabarcoding - 5.5

sequence database generation - 5.6

  • found existing data for 130 species on NCBI - SRA
  • novel sequence data for 25 species, varying number of loci
  • whole ‘ring’ to be completed within the year

Species in Sequence DB

sequence assignment - 5.7 - I

  • Trimmomatic (discard short reads)
  • Kraken (many false positives)
  • Bracken (many many false positives)
  • BLAST (fewest false positives)

Three Initial Networks

sequence assignment - 5.7 - II

Post classification of Sequences via Taxonomy and Ecology, top 15 most abundant reads

Condition   No. Class.   Prcnt. Class.   Total Seqs   Rank
A                  143            21.0         32.0   Species
B                  205            30.1         10.5   Species
C                    5             0.7          0.4   Genus
G                   29             4.3          7.8   Species
H                  280            41.2         47.9   Genus
None met            18             2.6          1.4   Multiple

sequence assignment - 5.7 - III

  • Naive BLAST from custom databases: 26% accuracy
  • post-classified BLAST using temporal filters to create genera monogeneric in space and time: 44%
  • BLAST creates many false positives
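
The temporal post-classification step can be pictured roughly as follows: if only one member of the top hit's genus is expected to be flowering locally on the sampling date, the read is assigned to that species, and otherwise it stays at genus rank. This is a hedged sketch of the idea only; `post_classify`, the flowering windows, and the placeholder taxa are hypothetical, and the real rule set (conditions A to H) is not reproduced here.

```python
def post_classify(top_hit, flowering_windows, sample_doy):
    """Refine a BLAST top hit using local flowering windows.

    flowering_windows maps species present at the site to a
    (start_day, end_day) flowering window in day of year.
    Returns (taxon, rank).
    """
    genus = top_hit.split()[0]
    congeners = [
        species
        for species, (start, end) in flowering_windows.items()
        if species.split()[0] == genus and start <= sample_doy <= end
    ]
    if len(congeners) == 1:
        # The genus is effectively one species at this place and time.
        return congeners[0], "species"
    return genus, "genus"


# Hypothetical local flowering windows (day of year) for placeholder taxa.
windows = {
    "Astragalus sp. A": (140, 185),
    "Astragalus sp. B": (175, 230),
    "Oxytropis sp. A": (150, 200),
}
print(post_classify("Astragalus sp. B", windows, sample_doy=160))
print(post_classify("Astragalus sp. B", windows, sample_doy=180))
```

In the first call only one Astragalus is in flower, so the naive hit is corrected to it; in the second both are flowering and the read is left at genus.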

sequence assignment - 5.7 - IV

  • conceptually similar to the automated process
  • utilized high resolution occurrence and phenology data
  • utilized morphological and molecular data
  • no linear operation or rule of precedence
  • classified all sequences to species

semi-quantitative evidence - 5.8

  • some relationship exists
  • requires further work by someone else

Counted grains

final floral feeding structure - 5.9

6 - discussion

conservation implications - 6.1

  • a hidden plain text paragraph in the discussion.
  • collaboration with Ken Holsinger, Jedd Sondergard @ BLM Montrose

conservation implications - 6.1 - II

  • historic vegetation treatment removals
    • Delphinium, Astragalus & Oxytropis
  • altered fire cycle
    • Delphinium, Mertensia
  • stream channelization / wetland removal
    • Mertensia
  • seed species, where missing
  • allow return of historic fire cycle, when appropriate
  • reintroduce beavers or, if unavailable, analogs

7 - future

metabarcoding; computational approaches - 7.2

  • qualitative:
    • search for variable loci
    • flanking regions and pop gen

  • quantitative:
    • read re-assignment based on phylogenetic distance
    • read re-assignment in Bayesian framework

metabarcoding; new data sets? - 7.1

  • (more) West Slope
  • artificial mixtures
    • leaf tissue (counted cells)
      • easier to collect
    • pollen loads (counted grains)
      • more popular
  • Gunnison Sage-Grouse scat
    • BLM habitat assessment data
    • opportunistic collections

bombus; trends in perennial bunchgrasses - 7.3

8 - conclusions

promising

acknowledgements

Two super fantastic technicians for the field seasons I worked during school made my life SO easy.
Find them funding and recruit them, and then give me a finder's fee.

  • Dani Yashinowitz B.S. (Yellowstone National Park, botanist & crew lead, Whitebark Pine Surveys (!!!))
  • Hannah Lovell B.S. (Telluride Mountain Resort, and in search of work)

acknowledgements

Employment: Yingying Xie, Josh Scholl, Sam Isham, Kelly McMillen, Kay Hajek, Linda Vance, Cassandra Owen, Ken Holsinger

Project: Nyree Zerega, Pat Herendeen, Hilary Noble, Zoe Diaz-Martinez, Angela McDonnell, Elena Loke, Ian Breckheimer, Ben Legler, Ernie Nelson, Charles (Rick) Williams, D. Knoke, L. Brummer, J. Boyd, C. Davidson, I. Gilman, M. Kirkpatrick, S. McCauley, J. Smith, K. Taylor, & C. Williams. David Giblin, Mare Nazaire, Sarah Burnett, Lauren Price, T.C.H. Cole, Eliot Gardner.